2a07

Introduction
FOXP2 is a transcription factor containing a winged-helix DNA binding domain, and is necessary for proper development of the lungs and brain. Its gene is located on human chromosome 7 (7q31), and the protein comprises 715 amino acids in its major splice form. FOXP2 is a member of the FOX (forkhead box) family and the FOXP (FOXP1-4) subfamily of proteins, all of which contain a 90 amino acid winged-helix DNA binding motif. Members of the FOXP subfamily contain a glutamine-rich region, a leucine zipper, a zinc finger, and a forkhead domain, and the subfamily is notable for domain swapping.

About this Structure
2A07 is a 10-chain structure with sequences from Homo sapiens. Full crystallographic information is available from OCA.

Evolutionary Conservation
Because FOXP2 is the first protein to be solidly linked to a human speech disorder, much investigation has focused on its evolutionary history in hopes of elucidating the molecular mechanism behind the uniqueness of human speech. Evidence suggests that the protein underwent a recent selective sweep within the same time frame as the purported rise of language in modern humans, which is consistent with a role in the emergence of speech. Interestingly, molecular data also show that Neanderthals and modern humans share the same sequence. Chimpanzee, gorilla, and rhesus monkey FOXP2 sequences are identical. The human version differs from these primate versions by two amino acid changes: a threonine to asparagine at position 303 and an asparagine to serine at position 325. Researchers have hypothesized that the human-specific change at 325 creates a potential phosphorylation site for protein kinase C, with a predicted corresponding change in the protein's transcriptional regulation. Experiments in mouse and human neuronal cell models suggest that these human-specific changes play an important role in vocalization and neural function.

Overall Structure
The asymmetric unit of the 2A07 crystal structure contains 6 copies of the FOXP2 forkhead domain and 2 double-stranded segments of DNA. All 6 copies are identical in sequence. 2 copies exist in monomeric form and the other 4 copies exhibit domain swapping. The monomers bind intimately to equivalent sites on the 2 segments of DNA, whereas the two swapped dimers associate only loosely with DNA. The DNA-bound monomeric form folds into the canonical winged-helix motif characteristic of the FOX family. Its core comprises three stacking alpha helices (H1, H2, and H3) capped at one end by a three-stranded anti-parallel beta sheet (S1, S2, and S3). The turn between H2 and H3 contains a helix (H4), as seen in other FOX proteins. Between strands S2 and S3, conventional FOX proteins contain a 5-7 amino acid insert, called wing 1 (W1). In FOXP2, however, this insert is truncated, resulting in a simple type I turn that joins strands S2 and S3. The C-terminal region also distinguishes the FOXP subfamily from most other FOX proteins. In FOXA3 this region forms an extended loop, called wing 2 (W2), that contacts DNA extensively. The corresponding region in FOXP2 forms a helix (H5) that runs atop H1 and terminates at the DNA phosphate backbone. A similar helix H5 is also observed in the NMR structures of FOXD1 and FOXK1a (ILF-1), but the sequences and trajectories of these helices are notably different from those of FOXP2. The heightened variability of the W1 and W2 regions relative to the rest of the forkhead domain across all FOX subfamilies suggests that the wings may have specialized functions within each subfamily.

DNA Recognition
DNA recognition by FOXP2 arises from interactions between various residues of the protein and both the bases and the sugar-phosphate backbone of the DNA. These interactions keep the protein/DNA complex stable.

Role of protein
Helix H3 plays the major role in DNA recognition by FOXP2. It makes several kinds of contact with the DNA: 1. Bidentate hydrogen bonds. 2. Direct or water-mediated hydrogen bonds. 3. Van der Waals contacts. Other parts of the protein, such as helices H1 and H2, strands S2 and S3, and the N and C termini, also interact with the DNA.

DNA binding site
The DNA binding site of FOXP2 can be defined as 5’-CAAATT-3’, which contains the core binding sequence. This site is defined based on the major groove contacts between the protein and the DNA. It is also similar to the site derived from in vitro selection.
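
Because a binding site can lie on either strand of double-stranded DNA, a search for the 5’-CAAATT-3’ site above usually also looks for its reverse complement. The following is a minimal sketch of that idea, not a method taken from the structure paper; the function names and the example sequence are hypothetical.

```python
# Illustrative only: the helper names and example sequence are made up.
SITE = "CAAATT"  # FOXP2 binding site as given in the text

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_sites(seq: str, site: str = SITE) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every occurrence of `site`.

    A '-' hit means the site is read on the complementary strand; its
    start is still reported in forward-strand coordinates.
    """
    seq = seq.upper()
    rc_site = reverse_complement(site)
    hits = []
    for i in range(len(seq) - len(site) + 1):
        window = seq[i:i + len(site)]
        if window == site:
            hits.append((i, "+"))
        if window == rc_site:
            hits.append((i, "-"))
    return hits

example = "GGCAAATTCCAATTTGAA"  # made-up sequence with one site per strand
print(find_sites(example))
```

An exact string match is used here for simplicity; real binding-site searches typically tolerate mismatches or use scoring matrices.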

Stability of the FOXP2/DNA complex
The FOXP2/DNA complex is stable due to the following interactions.
1. Asn550 forms bidentate hydrogen bonds with Ade10.
2. His554 and Arg553 form direct or water-mediated hydrogen bonds with Thy10’ and Thy11’ respectively.
3. The main chain and side chain atoms of Arg553, His554, Ser557, and Leu558 also make extensive van der Waals contacts to Cyt8, Gua8’, Thy9’, Thy10’, Thy11’, Ade12’, and Ade13’.
4. A number of aromatic or hydrophobic residues from helix H1 (Tyr509), H2 (Leu527 and Tyr531), and strand S3 (Trp573) interact extensively with both helix H3 and the sugar-phosphate backbone.
5. In the periphery of the FOXP2/DNA interface, residues from the N and C termini (Arg504, Thr508, Arg583, and Arg584), including the main chain amide of Tyr509 at the N-terminal end of helix H1 and residues from S2 (e.g., Arg564), make hydrogen bonds, van der Waals contacts, and electrostatic interactions with the DNA backbone, providing further stability to the FOXP2/DNA complex.

The DNA-contacting residues on H3 are almost invariant, so this mode of DNA binding is likely common to all FOX proteins. These proteins do not recognize a single consensus sequence but rather a degenerate pattern, 5’-RYMAAYA-3’, where R = A or G, Y = C or T, and M = A or C.
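
The degenerate pattern can be made concrete by expanding each ambiguity letter into a character class. The sketch below does this with a regular expression, using only the three codes defined in the text (R, Y, M); the function names and test sequences are hypothetical and chosen purely for illustration.

```python
import re

# Ambiguity codes exactly as defined in the text.
CODES = {"R": "[AG]", "Y": "[CT]", "M": "[AC]"}

def degenerate_to_regex(pattern: str) -> re.Pattern:
    """Expand R/Y/M into character classes; other letters match literally."""
    return re.compile("".join(CODES.get(base, base) for base in pattern))

# Degenerate FOX recognition pattern from the text.
FOX_PATTERN = degenerate_to_regex("RYMAAYA")

def matches_fox_pattern(seq: str) -> bool:
    """True if `seq` is a 7-mer consistent with 5'-RYMAAYA-3'."""
    return FOX_PATTERN.fullmatch(seq.upper()) is not None

# Made-up 7-mers: the first two fit the pattern, the third does not
# because its first base (C) is not a purine.
for candidate in ("GTAAACA", "ATAAATA", "CTAAACA"):
    print(candidate, matches_fox_pattern(candidate))
```

The pattern allows 2 x 2 x 2 x 2 = 16 distinct 7-mers, which illustrates how loosely the family-wide consensus is defined.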

Compared with other FOX family proteins such as FOXA3, FOXP2 has only one wing that interacts with the DNA, and it makes limited contacts. FOXA3, by contrast, uses two loops (wing 1 and wing 2) to bind the DNA backbone and minor groove extensively. Accordingly, FOXP2 has been observed to bind DNA with lower affinity than FOXA3.

Domain Swapping
In FOXP2, two monomers exchange helix H3 and strands S2 and S3. S2 and S3 of one monomer interact with H3 of the other monomer to form a dimer, which creates a single straight 15 amino acid stretch. In experiments with artificially mutated proteins similar to FOXP2, no domain swapping was found, from which it was inferred that domain swapping is a structural feature. In addition, when the alanine at position 539 was deliberately replaced by a proline, the protein did not dimerize, indicating that the alanine at this position supports dimerization.

Disease Mutations
Disease mutations in FOXP2 and related proteins map to either the domain-swapping dimer interface or the DNA binding surface. FOXP2 and its family member FOXP3 are similar enough that the crystal structure of one can be used to explain the structural effects of mutations in the other. In FOXP3, the mutations Ile363Val, Phe371Cys, Phe371Leu, Ala384Thr, and Arg397Trp have been traced to the autoimmune disorder IPEX. With the exception of the Arg553His mutation, all of the amino acid numbers in the titles below refer to FOXP3.
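
The FOXP3-to-FOXP2 residue correspondences quoted in the subsections below can be collected into a small lookup table. The sketch below is only a bookkeeping aid built from the four pairs given in this article; the names are hypothetical, and applying the offset to positions not listed here would be an assumption.

```python
# FOXP3 -> FOXP2 residue numbers, taken from the subsections of this article.
FOXP3_TO_FOXP2 = {
    363: 530,  # Ile
    371: 538,  # Phe
    384: 551,  # Ala
    397: 564,  # Arg
}

# Check whether the listed pairs share a single numbering offset.
offsets = {foxp2 - foxp3 for foxp3, foxp2 in FOXP3_TO_FOXP2.items()}
assert len(offsets) == 1, "listed pairs do not share one offset"
OFFSET = next(iter(offsets))

def foxp3_to_foxp2(position: int) -> int:
    """Map a FOXP3 residue number to FOXP2 numbering.

    Only the positions listed in the article are accepted; extrapolating
    the offset to other residues is not supported by the text.
    """
    if position not in FOXP3_TO_FOXP2:
        raise ValueError(f"no FOXP2 equivalent listed for FOXP3 {position}")
    return FOXP3_TO_FOXP2[position]

print(OFFSET, foxp3_to_foxp2(397))
```

For the four listed pairs, the FOXP2 number is the FOXP3 number plus 167.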

Arginine to Histidine at 553
This is the only mutation that has been characterized in FOXP2 itself. Arg553 is a major component in the binding of FOXP2 to DNA. It forms a hydrogen bond with Thy11' and van der Waals contacts with Thy11', Ade12', and Ade13'. Mutation to histidine disrupts the interface between the protein and DNA, presumably abolishing binding or reducing binding affinity.

Isoleucine to Valine at 363
Ile363 in FOXP3 corresponds to Ile530 in FOXP2. The Ile530Val mutation alters DNA binding by influencing neighboring residues that contact DNA. Ile530 (H2) makes van der Waals contacts with Leu527, Leu556, and Trp573. Leu527 and Trp573 in turn directly contact the DNA backbone, while Leu556 holds helix H3 in place to facilitate DNA recognition. The removal of a single methyl group when isoleucine is changed to valine destabilizes the entire binding interface.

Alanine to Threonine at 384
Ala384 in FOXP3 corresponds to Ala551 in FOXP2. Mutation of alanine to threonine at position 384 (H3) introduces an extra methyl group, causing steric interference in the restrictive protein/DNA juncture and disrupting DNA binding.

Arginine to Tryptophan at 397
Arginine 397 in FOXP3 corresponds to Arg564 in FOXP2. Arg564 (S2) binds moieties on the DNA backbone and in the minor groove. Mutation of Arg564 to a bulky tryptophan destabilizes protein/DNA interaction through steric hindrance.

Phenylalanine to Cysteine at 371 and Phenylalanine to Leucine at 371
Phe371 in FOXP3 corresponds to Phe538 in FOXP2. Phe538 is not located near the DNA binding face, and so probably plays little to no role in direct DNA binding. It is, however, at the center of the domain-swapped dimer interface. Mutations at this Phe may then cause disease by disturbing the core of the domain-swapped dimer.